Execution and Cache Performance of the Scheduled Dataflow Architecture
نویسندگان
چکیده
This paper presents an evaluation of our Scheduled Dataflow (SDF) Processor. Recent focus in the field of new processor architectures is mainly on VLIW (e.g. IA-64), superscalar and superspeculative architectures. This trend allows for better performance at the expense of an increased hardware complexity and a brute-force solution to the memory-wall problem. Our research substantially deviates from this trend by exploring a simpler, yet powerful execution paradigm that is based on dataflow concepts. A program is partitioned into functional execution threads, which are perfectly suited for our non-blocking multithreaded architecture. In addition, all memory accesses are decoupled from the thread’s execution. Data is pre-loaded into the thread’s context (registers), and all results are post-stored after the completion of the thread’s execution. The decoupling of memory accesses from thread execution requires a separate unit to perform the necessary pre-loads and post-stores, and to control the allocation of hardware thread contexts to enabled threads. The analytical analysis of our architecture showed that we could achieve a better performance than other classical dataflow architectures (i.e., ETS), hybrid models (e.g., EARTH) and decoupled multithreaded architectures (e.g., Rhamma processor). This paper analyzes the architecture using an instruction set level simulator for a variety of benchmark programs. We compared the execution cycles required for programs on SDF with the execution cycles required by the programs on DLX (or MIPS). Then we investigated the expected cache-memory performance by collecting address traces from programs and using a trace-driven cache simulator (Dinero-IV). We present these results in this paper. Category: Processor Architectures, Performance of Systems.
منابع مشابه
Execution And Cache Performance Of A Decoupled Non-Blocking Multithreaded Architecture
In this paper we will present an evaluation of the execution performance and cache behavior of a new multithreaded architecture being investigated by the authors. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are p...
متن کاملExecution Performance of the Scheduled Dataflow Architecture (SDF)
This paper presents an evaluation of a nonblocking, decoupled memory/execution, multithreaded architecture known as the Scheduled Dataflow (SDF). Recent focus in the field of new processor architectures is mainly on VLIW (e.g. IA-64), superscalar and superspeculative designs. This trend allows for better performance at the expense of increased hardware complexity, and possibly higher power expe...
متن کاملScheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation
ÐIn this paper, the Scheduled Dataflow (SDF) architectureÐa decoupled memory/execution, multithreaded architecture using nonblocking threadsÐis presented in detail and evaluated against Superscalar architecture. Recent focus in the field of new processor architectures is mainly on VLIW (e.g., IA-64), superscalar, and superspeculative designs. This trend allows for better performance, but at the...
متن کاملDesign of cache memories for dataflow architecture
The recent advance in dataflow processing — to combine the dataflow paradigm with the control flow paradigm — has brought out many new challenging issues. This hybrid organization has made it possible to study and adapt familiar control flow concepts such as cache memories within the framework of the dataflow architecture. The concept of cache memory has proven its effectiveness in the von Neum...
متن کاملComparing Execution Performance of Scheduled Dataflow With RISC Processors
In this paper we describe a new approach to designing multithreaded architecture that can be used as the basic building blocks in high-end computing architectures. Our architecture uses non-blocking multithreaded model based on dataflow paradigm. In addition, all memory accesses are decoupled from the thread execution. Data is pre-loaded into the thread context (registers), and all results are ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- J. UCS
دوره 6 شماره
صفحات -
تاریخ انتشار 2000